Online classification for time-domain astronomy
The advent of synoptic sky surveys has spurred the development of techniques
for real-time classification of astronomical sources in order to ensure timely
follow-up with appropriate instruments. Previous work has focused on algorithm
selection or improved light curve representations, and has naively converted light
curves into structured feature sets without regard for the time span or phase
of the light curves. In this paper, we highlight the violation of a fundamental
machine learning assumption that occurs when archival light curves with long
observational time spans are used to train classifiers that are applied to
light curves with fewer observations. We propose two solutions to deal with the
mismatch in the time spans of training and test light curves. The first is the
use of classifier committees where each classifier is trained on light curves
of different observational time spans. Only the committee member whose training
set matches the test light curve time span is invoked for classification. The
second solution uses hierarchical classifiers that are able to predict source
types both individually and by sub-group, so that the user can trade off an
earlier, more robust classification against classification granularity. We test
both methods using light curves from the MACHO survey, and demonstrate their
usefulness in improving performance over similar methods that naively train on
all available archival data.
Comment: Astroinformatics workshop, IEEE International Conference on Data Mining 201
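The committee idea in the abstract above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names (`features`, `truncate`, `train_committee`, `classify`), the two toy features (mean and standard deviation of magnitude), and the nearest-centroid member classifier are all assumptions made here for brevity. Each committee member is trained on archival light curves truncated to one time span, and only the member whose span is closest to that of the test light curve is invoked.

```python
from statistics import mean, pstdev


def features(curve):
    """Toy feature vector (mean, std of magnitude) for a list of (time, magnitude)."""
    mags = [m for _, m in curve]
    return (mean(mags), pstdev(mags))


def truncate(curve, span):
    """Keep only observations within `span` time units of the first observation."""
    t0 = curve[0][0]
    return [(t, m) for t, m in curve if t - t0 <= span]


class NearestCentroid:
    """Stand-in member classifier (an assumption, not the paper's choice)."""

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(x)
        # One centroid per class: the per-feature mean of its training vectors.
        self.centroids = {
            label: tuple(mean(col) for col in zip(*rows))
            for label, rows in groups.items()
        }
        return self

    def predict(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def train_committee(curves, labels, spans):
    """Train one member per time span on archival curves truncated to that span."""
    return {
        span: NearestCentroid().fit(
            [features(truncate(c, span)) for c in curves], labels
        )
        for span in spans
    }


def classify(committee, curve):
    """Invoke only the member whose training span best matches the test curve."""
    span = curve[-1][0] - curve[0][0]
    best = min(committee, key=lambda s: abs(s - span))
    return committee[best].predict(features(curve))
```

A committee would be built once from the archive, for example `train_committee(archive_curves, archive_labels, [10, 50, 100])`, and `classify` would then be called on each incoming light curve as observations accumulate. The archive variables and the span values here are hypothetical.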
Exploratory Analysis of Highly Heterogeneous Document Collections
We present an effective multifaceted system for exploratory analysis of
highly heterogeneous document collections. Our system is based on intelligently
tagging individual documents in a purely automated fashion and exploiting these
tags in a powerful faceted browsing framework. Tagging strategies employed
include both unsupervised and supervised approaches based on machine learning
and natural language processing. As one of our key tagging strategies, we
introduce the KERA algorithm (Keyword Extraction for Reports and Articles).
KERA extracts topic-representative terms from individual documents in a purely
unsupervised fashion and is shown to be significantly more effective than
state-of-the-art methods. Finally, we evaluate our system on its ability to
help users locate documents pertaining to military critical technologies buried
deep in a large heterogeneous sea of information.
Comment: 9 pages; KDD 2013: 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
TimeSets for uncertainty visualisation
TimeSets consists of a timeline showing a sequence of events displayed across a visualisation, while making sense of set relations among events in the timeline [NXWW15]. This study looked into extending TimeSets to accommodate the visualisation of trust and uncertainty as variables for events displayed across the timeline. The aim of the challenge is to build tools in the context of big data analytics that can be used to aid military operations through intelligence analytics and decision-making.